Review for NeurIPS paper: On Efficiency in Hierarchical Reinforcement Learning

Neural Information Processing Systems

Quoting from the reviewers: R1: The paper presents a novel framework for analyzing potential efficiencies in reinforcement learning due to hierarchical structure in MDPs. The framework formally defines several useful concepts (subMDPs, equivalent subMDPs, exit states, and exit profiles) that allow for an elegant refinement of regret bounds in a well-defined regime, and the identification of these properties gives a clear, concrete basis for the theoretical analysis of hierarchical reinforcement learning. Overall, the paper formalizes hierarchical structure, quantifies the efficiency such structure may allow, and derives results on learning and planning efficiency when the reinforcement learning problem has repeated structure.
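For readers unfamiliar with these terms, a rough formalization may help (this is a sketch of the standard notions only; the paper's exact definitions and notation may differ):

```latex
% Partition the state space S of the base MDP into regions S_1, ..., S_m.
% Each region S_i induces a subMDP. Its exit states are the states
% reachable in one step from inside the region but lying outside it:
E_i \;=\; \{\, s' \in S \setminus S_i \;:\; \exists\, s \in S_i,\ a \in A
\ \text{with}\ P(s' \mid s, a) > 0 \,\}.
% Two subMDPs are equivalent if a bijection between their regions (and
% exit states) preserves rewards and transition probabilities. An exit
% profile assigns a value to each exit state, so planning inside a subMDP
% need only be done once per (equivalence class, exit profile) rather
% than once per region -- the source of the claimed efficiency gains.
```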


Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods

Topin, Nicholay, Milani, Stephanie, Fang, Fei, Veloso, Manuela

arXiv.org Artificial Intelligence

Current work in explainable reinforcement learning generally produces policies in the form of a decision tree over the state space. Such policies can be used for formal safety verification, agent behavior prediction, and manual inspection of important features. However, existing approaches either fit a decision tree after training or use a custom learning procedure that is not compatible with new learning techniques, such as those which use neural networks. To address this limitation, we propose a novel Markov Decision Process (MDP) type for learning decision tree policies: Iterative Bounding MDPs (IBMDPs). An IBMDP is constructed around a base MDP so that, when a method-agnostic masking procedure is used, each IBMDP policy is guaranteed to correspond to a decision tree policy for the base MDP. Because of this decision tree equivalence, any function approximator can be used during training, including a neural network, while still yielding a decision tree policy for the base MDP. We present the required masking procedure as well as a modified value update step that allows IBMDPs to be solved using existing algorithms. We apply this procedure to produce IBMDP variants of recent reinforcement learning methods, and we empirically show the benefits of our approach by solving IBMDPs to produce decision tree policies for the base MDPs.
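As a rough illustration of the construction described in the abstract, here is a minimal sketch of an IBMDP-style wrapper. This is not the authors' implementation: the class and method names, the midpoint split rule, the penalty `zeta`, and the gym-like `reset()`/`step()` interface of `base_env` are all illustrative assumptions.

```python
import numpy as np

class IterativeBoundingMDP:
    """Sketch of an IBMDP-style wrapper around a base MDP (hypothetical API).

    The agent never observes the raw base state, only per-feature
    lower/upper bounds (the masking). Actions are either
    information-gathering actions (split one feature's interval at its
    midpoint, keeping the half that contains the true state) or base
    actions (act in the base MDP and reset the bounds). Because the
    policy conditions only on bounds, and each run of splits traces a
    root-to-leaf path, a deterministic IBMDP policy induces a decision
    tree over the base state space.
    """

    def __init__(self, base_env, n_features, feature_low, feature_high,
                 zeta=-0.01):
        self.base = base_env                 # assumed gym-like: reset()/step(a)
        self.n = n_features
        self.low0 = np.asarray(feature_low, dtype=float)
        self.high0 = np.asarray(feature_high, dtype=float)
        self.zeta = zeta                     # small penalty per split action

    def reset(self):
        self.state = np.asarray(self.base.reset(), dtype=float)
        self.low, self.high = self.low0.copy(), self.high0.copy()
        return self._obs()

    def _obs(self):
        # Masked observation: current bounds only, never the base state.
        return np.concatenate([self.low, self.high])

    def step(self, action):
        if action < self.n:                  # information-gathering action
            i = action
            mid = 0.5 * (self.low[i] + self.high[i])
            if self.state[i] <= mid:         # tighten the half holding the state
                self.high[i] = mid
            else:
                self.low[i] = mid
            return self._obs(), self.zeta, False, {}
        # Base action: act in the base MDP, then reset the bounds.
        s, r, done, info = self.base.step(action - self.n)
        self.state = np.asarray(s, dtype=float)
        self.low, self.high = self.low0.copy(), self.high0.copy()
        return self._obs(), r, done, info
```

Read off as a tree: each information-gathering step is an internal node testing feature i against its interval midpoint, and the base action that ends a run of splits is a leaf, which is why any function approximator trained on the masked observations still yields a decision tree policy for the base MDP.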